Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences

The Royal Society

Preprints posted in the last 7 days, ranked by how well they match Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences's content profile, based on 15 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
A geometric-surface PDE model for cell-nucleus translocation through confinement

Ballatore, F.; Madzvamuse, A.; Jebane, C.; Helfer, E.; Allena, R.

2026-04-17 biophysics 10.64898/2025.12.18.695144 medRxiv
Top 0.2%
3.0%

Understanding how cells migrate through confined environments is crucial for elucidating fundamental biological processes, including cancer invasion, immune surveillance, and tissue morphogenesis. The nucleus, as the largest and stiffest cellular organelle, often limits cellular deformability, making it a key factor in migration through narrow pores or highly constrained spaces. In this work, we introduce a geometric surface partial differential equation (GS-PDE) model in which the cell plasma membrane and nuclear envelope are described as evolving energetic closed surfaces governed by force-balance equations. We replicate the results of a biophysical experiment, where a microfluidic device is used to impose compressive stresses on cells by driving them through narrow microchannels under a controlled pressure gradient. The model is validated by reproducing cell entry into the microchannels. A parametric sensitivity analysis highlights the dominant influence of specific parameters, whose accurate estimation is essential for faithfully capturing the experimental setup. We found that surface tension and confinement geometry emerge as key determinants of translocation efficiency. Although tailored to this specific setup for validation purposes, the framework is sufficiently general to be applied to a broad range of cell mechanics scenarios, providing a robust and flexible tool for investigating the interplay between cell mechanics and confinement. It also offers a solid foundation for future extensions integrating more complex biochemical processes such as active confined migration.

2
Informing Epidemic Control Strategies: A Spatial Metapopulation Model Incorporating Recurrent Mobility, Clustering, and Group-Structured Interactions

Smah, M. L.; Seale, A.; Rock, K.

2026-04-11 infectious diseases 10.64898/2026.04.08.26350398 medRxiv
Top 0.2%
2.7%

Infectious disease dynamics are strongly shaped by human mobility, social structure, and heterogeneous contact patterns, yet many epidemic models do not jointly capture these features. This study develops a spatial metapopulation epidemic model incorporating recurrent group-switch interactions to represent real-world transmission processes. Building on the Movement-Interaction-Return framework, the model integrates household structure, age-stratified contacts, and mobility between locations within a single SEIR framework. Using UK demographic, mobility, and social contact data, the model quantifies how within- and between-group interactions, mobility rates, and location connectivity influence epidemic spread. Both deterministic and stochastic simulations are implemented to analyse outbreak dynamics, variability, and fade-out probabilities for COVID-19-like and Ebola-like infections. Results show that highly connected locations drive faster transmission, earlier epidemic peaks, and greater difficulty in containment, whereas larger but less connected locations tend to produce slower, more localised outbreaks despite their population size. Comparative analysis reveals that COVID-19-like infections spread rapidly and remain difficult to control even under interventions, while Ebola-like infections exhibit slower dynamics and are more effectively contained, particularly under targeted measures. Non-pharmaceutical interventions, particularly widespread closures, substantially reduce infections, hospitalisations, and deaths, although effectiveness depends on timing and pathogen characteristics. These findings highlight the importance of integrating mobility, clustering, and demographic heterogeneity to inform targeted and effective epidemic control strategies.

3
A Multi-Clique Network Model for Epidemic Spread with Fully Accessible Within-Group and Limited Between-Group Contacts

Smah, M. L.; Seale, A. C.; Rock, K. S.

2026-04-11 infectious diseases 10.64898/2026.04.08.26350390 medRxiv
Top 0.8%
0.8%

Network-based epidemic models have been instrumental in understanding how contact structure shapes infectious disease dynamics, yet widely used frameworks such as Erdős-Rényi, configuration-model, and stochastic block networks do not explicitly capture the combination of fully accessible (saturated) within-group interactions and constrained between-group connectivity characteristic of many real-world settings. Here, we introduce the Multi-Clique (MC) network model, a generative framework in which individuals are organised into fully connected cliques representing stable contact groups (e.g., households, classrooms, or workplaces), with a limited number of external connections governing inter-group transmission. Using stochastic susceptible-infectious-recovered (SIR) simulations on degree-matched networks, we compare epidemic dynamics on MC networks with those on classical random graph models. Despite having an identical mean degree, MC networks exhibit systematically distinct behaviour, including slower epidemic growth, reduced peak prevalence, increased fade-out probability, and delayed time to peak. These effects arise from rapid within-clique but constrained between-clique transmission, creating structural bottlenecks that standard models do not capture. The MC framework provides an interpretable, data-driven representation of recurrent contact structure, with parameters that map directly to observable quantities such as household and classroom sizes. By isolating the role of intergroup connectivity, the model offers a basis for evaluating targeted intervention strategies that reduce between-group mixing while preserving within-group interactions. Our results highlight the importance of explicitly representing the real-life clique-based network structure in epidemic models and suggest that classical degree-matched networks may systematically overestimate epidemic speed and intensity in structured populations.
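The clique construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' generator: the abstract does not say how external connections are placed, so the uniform random placement below, the function name `build_multi_clique_network`, and all of its parameters are assumptions made for illustration only.

```python
import random
from itertools import combinations


def build_multi_clique_network(n_cliques, clique_size, external_links_per_node, seed=0):
    """Build a toy multi-clique network (hypothetical helper, not from the paper).

    Returns (adj, clique_of) where adj maps each node to its set of
    neighbours and clique_of maps each node to its clique index.
    """
    rng = random.Random(seed)
    n_nodes = n_cliques * clique_size
    adj = {v: set() for v in range(n_nodes)}
    clique_of = {v: v // clique_size for v in range(n_nodes)}

    # Within-group contacts: every clique is fully connected.
    for c in range(n_cliques):
        members = range(c * clique_size, (c + 1) * clique_size)
        for u, v in combinations(members, 2):
            adj[u].add(v)
            adj[v].add(u)

    # Between-group contacts (assumed rule): each node makes a fixed number
    # of attempts to link to a uniformly chosen node; attempts that land in
    # the node's own clique are discarded, so external links stay limited.
    for u in range(n_nodes):
        for _ in range(external_links_per_node):
            v = rng.randrange(n_nodes)
            if clique_of[v] != clique_of[u]:
                adj[u].add(v)
                adj[v].add(u)

    return adj, clique_of


if __name__ == "__main__":
    adj, clique_of = build_multi_clique_network(
        n_cliques=50, clique_size=4, external_links_per_node=1
    )
    mean_degree = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
    print(f"nodes={len(adj)} mean_degree={mean_degree:.2f}")
```

With `external_links_per_node=0` the cliques are disconnected, which is the limiting case of the between-group bottleneck the abstract describes.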

4
Spine Reviews: Crowdsourcing Global Spine Expert Knowledge via Digital Ledger Technology

Challier, V.; Diebo, B.; Lafage, V.; Dehouche, N.; Lonjon, G.; Cristini, J.; SpineDAO

2026-04-13 health informatics 10.64898/2026.04.11.26350678 medRxiv
Top 1%
0.5%

Study Design: Prospective observational study using a novel digital ledger technology (DLT)-based crowdsourcing platform. Objective: To develop and evaluate Spine Reviews, a blockchain-based platform for aggregating spine treatment recommendations from an international specialist panel, and to validate the clinical coherence of the resulting dataset. Summary of Background Data: Predictive models for low back pain treatment are limited by small, homogeneous datasets that fail to capture inter-clinician variability. Traditional multi-center data collection is expensive, slow, and geographically constrained. DLT-based crowdsourcing with cryptographic credentialing may overcome these barriers. Methods: Five hundred synthetic patient vignettes (digital twins) were generated; 463 retained after quality control. A review platform was built on the Solana blockchain using non-transferable Soulbound Tokens (SBTs) for credentialing and smart-contract compensation. Fifty-two specialists from 7 countries provided 4+ reviews per vignette across four treatment tiers, without access to imaging or physical examination. Mixed-effects regression with reviewer random intercepts partitioned decision variability. Results: The platform collected 2,066 completed reviews (97.7%) over 37 days at USD 0.97/review. Variance decomposition revealed that 36.7% of treatment tier variability was attributable to patient presentation, 19.2% to reviewer practice style, and 44.1% to their interaction. Neurological deficits (beta=0.39), symptom duration (beta=0.12), and pain (beta=0.09) independently predicted treatment escalation (all p<0.001). Gwet's AC1 was almost perfect for emergency (0.92) and substantial for conservative decisions (0.67). Reviewer confidence in treatment recommendations decreased with escalating tier severity (conservative 4.59/5 vs surgical 4.05/5), suggesting appropriate uncertainty calibration. 
Conclusions: DLT with SBT credentialing enables rapid, global, cost-effective aggregation of clinically coherent expert judgment. The three-component variance structure quantifies clinical equipoise in spine care and establishes that predictive models require diverse, multi-reviewer training data. Keywords: digital ledger technology; blockchain; crowdsourcing; clinical decision-making; low back pain; Soulbound Tokens

5
Noisy periodicity in tropical respiratory disease dynamics

Yang, F.; Hanks, E. M.; Conway, J. M.; Bjornstad, O. N.; Thanh, N. T. L.; Boni, M. F.; Servadio, J. L.

2026-04-13 epidemiology 10.64898/2026.04.10.26350660 medRxiv
Top 2%
0.3%

Infectious disease surveillance systems in tropical countries show that respiratory disease incidence generally manifests as year-round activity with weak fluctuations and irregular seasonality. Previously, using a ten-year time series of influenza-like illness (ILI) collected from outpatient clinics in Ho Chi Minh City (HCMC), Vietnam, we found a combination of nonannual and annual signals driving these dynamics, but with unknown mechanisms. In this study, we use seven stochastic dynamical models incorporating humidity, temperature, and school term to investigate plausible mechanisms behind these annual and nonannual incidence trends. We use iterated filtering to fit the models and evaluate the models by comparing how well they replicate the combination of annual and nonannual signals. We find that a model including specific humidity, temperature, and school term best fits our observed data from HCMC and partially reproduces the irregular seasonality. The estimated effects from specific humidity and temperature on transmission are nonlinearly negative but weak. School dismissal is associated with decreased transmission, but also with low magnitude. Under these weak external drivers, we hypothesize that stochasticity makes a strong sub-annual cycle more likely to be observed in ILI disease dynamics. Our study shows a possible mechanism for respiratory disease dynamics in the tropics. When the external drivers are weak, the seasonality of respiratory disease dynamics is prone to the influence of stochasticity.

6
Mechanistic Insights into Skin Sympathetic Nerve Activity Dynamics in Healthy Subjects Through a Two-Layer Signal-Analytical and Closed-Loop Physiological Modeling Framework

Lin, R.; Halfwerk, F. R.; Donker, D. W.; Tertoolen, J.; van der Pas, V. R.; Laverman, G. D.; Wang, Y.

2026-04-13 health informatics 10.64898/2026.04.11.26350680 medRxiv
Top 2%
0.3%

Objective: Skin sympathetic nerve activity (SKNA) has emerged as a promising non-invasive surrogate measure of sympathetic drive, but its relevant physiological characteristics remain ill-defined. This observational study aims to investigate its regulatory patterns during rest and Valsalva maneuver (VM) in healthy participants. Method: Using a two-layer strategy integrating signal analysis and physiological modelling, we analyzed data recorded from 41 subjects performing repeated VMs. The observational layer includes time-domain feature comparisons using linear mixed-effect models, and time-varying spectral coherence analysis. The mechanistic layer proposes a mathematical model to investigate whether baroreflex and respiratory modulation are sufficient to reproduce the observed HR and average SKNA (aSKNA) dynamics. Main Results: Mean integrated SKNA (iSKNA) showed more significant change than HRV for VM induced effects. We also found mean iSKNA increase during VM varies with BMI and sex. The coherence analysis indicated that iSKNA strongly synchronized with EDR under resting conditions. The proposed model successfully reproduced main characteristics of aSKNA dynamics, yielding a high median Pearson correlation coefficient of 0.80 ([Q1, Q3] = [0.60, 0.91]). In contrast, HR dynamics were only partially captured, with a median PCC of 0.37 ([Q1, Q3] = [0.16, 0.55]). These results likely suggest SKNA provides a more direct representation of sympathetic burst dynamics during VM in healthy subjects. Significance: This study provides convergent evidence that SKNA reflects known autonomic regulatory influences in healthy subjects. These findings strengthen the physiological interpretability of SKNA while clarifying its appropriate use as a practical biomarker of sympathetic function.

7
Individualised evoked response detection based on the spectral noise colour

Undurraga Lucero, J. A.; Chesnaye, M.; Simpson, D.; Laugesen, S.

2026-04-13 health informatics 10.64898/2026.04.11.26350685 medRxiv
Top 2%
0.2%

Objective detection of evoked potentials (EPs) is central to digital diagnostics in hearing assessment and clinical neurophysiology, yet current approaches remain time-intensive and sensitive to inter-individual noise variability. Many existing detection methods rely on population-based assumptions or computationally demanding procedures, limiting robustness and efficiency in real-world clinical settings. We present Fmpi, a digital EP detection framework enabling individualised, real-time response detection through analytical modelling of the spectral colour and temporal dynamics of background noise within each recording. Using extensive simulations and large-scale human electroencephalography datasets spanning brainstem, steady-state, and cortical EPs recorded in adults and infants, we demonstrate performance comparable or superior to state-of-the-art bootstrapped methods while operating at a fraction of the computational cost and maintaining well-controlled sensitivity with improved specificity. Importantly, Fmpi incorporates a futility detection mechanism enabling early termination of uninformative recordings, reducing testing time without compromising diagnostic reliability.

8
Validated Synthetic Data Generation from a Multicenter Spine Surgery Registry: Methodology and Benchmark

Challier, V.; Jacquemin, C.; Diebo, B.; Dehouche, N.; Denisov, A.; Cristini, J.; Campana, M.; Castelain, J.-E.; Lonjon, G.; Lafage, V.; Ghailane, S.; SpineDAO Collaborative Group

2026-04-11 health informatics 10.64898/2026.04.07.26350316 medRxiv
Top 3%
0.2%

Background: Synthetic data have emerged as a complementary strategy for secondary use of clinical registries, enabling data sharing without patient-level exposure. In spine surgery, multicenter data sharing is constrained by institutional governance and patient privacy regulations. Validated synthetic data generation may enable broader access to surgical outcomes data for artificial intelligence development without compromising patient confidentiality. Objective: To describe and benchmark a three-domain validated synthetic data pipeline applied to a multicenter, tokenized spine surgery registry (SpineBase), and to establish a reproducible certification framework for synthetic spine surgery datasets. Methods: We extracted 125 sacroiliac joint fusion cases from the SpineBase registry (SIBONE study, IRB-SOFCOT approval Ref. 14-2025; CNIL MR-004 Ref. 2234503 v 0). A GaussianCopula generative model was trained on 52 structured variables spanning demographics, preoperative assessments, operative details, and longitudinal outcomes at 3, 6, 12, and 24 months. Synthetic datasets of 100, 1,000, and 10,000 patients were generated. Validation followed a three-domain framework: (1) fidelity, assessed by Kolmogorov-Smirnov tests and Jensen-Shannon divergence; (2) utility, assessed by train-on-synthetic, test-on-real (TSTR) methodology; and (3) privacy, assessed by nearest-neighbor distance ratio (NNDR), membership inference attack, and k-anonymity proxy. Results: All three validation gates passed. Fidelity: mean KS p-value 0.52 (threshold >0.05). Privacy: NNDR >1.0 in 98.9% of synthetic records; membership inference AUROC 0.57. Utility: 12-month Oswestry Disability Index prediction yielded Pearson r = 0.29, consistent with expected attenuation at N = 125. A SHA-256 cryptographic hash of each certified dataset was anchored on the Solana blockchain for immutable provenance.
Conclusions: A validated, blockchain-anchored synthetic data pipeline for spine surgery registries is technically feasible and meets current publication-standard criteria for fidelity and privacy. Utility metrics scale with registry size, creating a direct incentive for multicenter data contribution. This framework provides a reproducible methodology for synthetic data certification in spine surgery research, and establishes certified synthetic datasets as a privacy-native substrate for expert-annotation pipelines, as demonstrated in the companion Spine Reviews study.
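The provenance step in this abstract rests on a SHA-256 digest of each dataset. The sketch below shows only how such a digest can be computed with Python's standard `hashlib`; the function name `sha256_of_stream`, the chunk size, and the demo CSV bytes are hypothetical, and the on-chain anchoring itself is not shown because the abstract gives no detail about it.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Return the hex SHA-256 digest of a binary stream, read in chunks.

    Hypothetical helper: the registry's actual hashing procedure
    (file format, canonicalisation, etc.) is not described in the abstract.
    """
    digest = hashlib.sha256()
    # iter() with a sentinel keeps reading until read() returns b"".
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Made-up bytes standing in for an exported synthetic dataset.
    demo_dataset = io.BytesIO(b"patient_id,odi_12m\n1,24\n2,31\n")
    print(sha256_of_stream(demo_dataset))
```

Any change to the dataset bytes, even a single character, yields a different digest, which is what makes a published hash usable as a tamper check.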

9
The single item physical activity (SIPA) measure: a major role for global surveillance and community program evaluation

Bauman, A.; Owen, K.; Messing, S.; Macdonald, H.; Nettlefold, L.; Richards, J.; Vandelanotte, C.; Chen, I.-H.; Cullen, B.; van Buskirk, J.; van Itallie, A.; Coletta, G.; O'Halloran, P.; Randle, E.; Nicholson, M.; Staley, K.; McKay, H. A.

2026-04-16 public and global health 10.64898/2026.04.14.26350895 medRxiv
Top 3%
0.1%

Military aviation training noise remains understudied despite its widespread impacts across urban, rural, and wilderness areas. The predominance of low-frequency noise and repetitive training can create pervasive noise pollution, yet past research often fails to capture the full range of health and quality-of-life effects. This study analyzed two complaint datasets related to Whidbey Island Naval Air Station noise: U.S. Navy records (2017-2020) and Quiet Skies Over San Juan County data (2021-2023). We analyzed and mapped sentiment intensity from noise complaints relative to modeled annual noise exposure, developed a typology to classify impacts, and modeled the environmental and operational factors influencing complaints. Findings revealed widespread negative sentiment and anger, often beyond the bounds of estimated noise contours, suggesting that annual cumulative noise models inadequately estimate community impacts. Complaints consistently highlighted sleep disturbance, hearing and health concerns, and compromised home environments due to shaking, vibration, and disruption of daily life. Residents also reported significant social, recreational, and work disruptions, along with feelings of fear, helplessness, and concern for children's well-being. The number of complaints was strongly associated with training schedules, with late-night sessions being the strongest predictor. A delayed response pattern suggests residents reach a frustration threshold before filing complaints. Overall, our findings demonstrate persistent negative sentiment and diverse impacts from military aviation noise. Results highlight the need for improved noise metrics, modeling, and operational adjustments to mitigate the most disruptive effects.

10
Trade-offs in emergency transport protocols for access to hip fracture management: a geospatial analysis of selective versus standard transfer in Ontario long-term care

Yee, N. J.; Chen, T.; Huang, Y. Q.; Whyne, C.; Halai, M.

2026-04-14 orthopedics 10.64898/2026.04.12.26350713 medRxiv
Top 4%
0.1%

Objectives: For suspected hip fractures, prehospital protocols directing patients to an orthopaedic centre rather than the nearest emergency department (ED) could reduce time-to-surgery but may impact EMS travel burden. This study evaluates the impact of transfer protocols by quantifying transport to hospitals from long-term care (LTC) facilities across Ontario. Methods: A retrospective cross-sectional analysis of all Ontario LTC facilities and hospitals was performed. Two protocols were modeled: standard transfer to the nearest ED with subsequent transfer if required, and selective transfer based on Collingwood Hip Fracture Rule prehospital screening directly to the nearest orthopaedic services (orthoED). Median one-way travel distances were calculated from Google Maps. Results: In Ontario, 15.4% of LTC residents require hospital destination decisions because their nearest ED lacks orthopaedic services; for these facilities, median distances were 2.7km to the ED and 36.0km to the orthoED. Among the 52 LTC facilities where selective transfer was distance-optimal, it substantially reduced travel for patients with hip fracture (31.1km vs 49.6km; P<.01) while only modestly increasing travel for patients without hip fracture. Where standard transfer was distance-optimal, little travel difference was noted for patients with hip fracture; however, false-positive screened patients traveled significantly further to an orthoED. Greatest negative consequences of selective transfer lie in the 1.3% of residents living farthest (>100km) from an orthoED. Conclusions: EMS direct transportation to hospitals with orthopaedics may improve hip fracture care but can increase EMS burden due to patients identified falsely as having a hip fracture, particularly in remote communities.

11
Bridging the Awareness Utilisation Gap in Reusable Menstrual Product Use Among Female Medical Students and Healthcare Professionals: A Cross-Sectional Study

Wami-Amadi, C. F.; Nonju, I. I.

2026-04-12 sexual and reproductive health 10.64898/2026.04.10.26350626 medRxiv
Top 4%
0.1%

Background: Reusable menstrual products provide sustainable and cost effective alternatives to disposable sanitary products; however, their adoption remains limited, even among healthcare professionals. Objectives: To assess awareness, knowledge, perceptions, and utilisation of reusable menstrual products among female medical students and healthcare professionals, and to identify predictors of willingness and use. Design: Cross sectional analytical study. Setting: An online survey was conducted among female medical students and healthcare professionals in Nigeria. Participants: A total of 203 female respondents aged 15 to 55 years. Intervention: Not applicable. Primary Outcome Measures: Utilisation of reusable menstrual products and willingness to adopt their use. Secondary Outcome Measures: Awareness, knowledge, perceptions, and barriers. Methods: Data were collected using a structured questionnaire and analysed using descriptive statistics, chi square tests, and logistic regression. Results: Awareness was high (96.06%), but utilisation was low, with 5.42% ever using and 4.43% currently using reusable products. About 31.53% were willing to use them. Respondent type was not associated with willingness (p = 0.735), although healthcare professionals had higher knowledge (p = 0.024). Positive perception predicted willingness (AOR = 7.58, 95% CI: 3.18 to 18.03, p < 0.001). Good knowledge (AOR = 14.96, p = 0.014) and increasing age (AOR = 1.28, p = 0.004) predicted utilisation. Conclusion: Despite high awareness, utilisation remains low. Perception influences willingness, while knowledge drives use. Targeted behavioural and educational interventions are needed. Keywords: Menstrual hygiene, reusable menstrual products, menstrual cup, sustainability, healthcare professionals

12
A case report on gendered biases in a Finnish healthcare AI assistant

Luisto, R.; Snell, K.; Vartiainen, V.; Sanmark, E.; Äyrämö, S.

2026-04-14 health informatics 10.64898/2026.04.09.26350383 medRxiv
Top 5%
0.1%

In this study, we investigate gender bias in a Retrieval-Augmented Generation (RAG) based AI assistant developed for Finnish wellbeing services counties. We tested the system using 36 clinically relevant queries, each rendered in three gendered variants (male, female, gender-neutral), and evaluated responses using both an LLM-as-a-judge approach and a human expert panel consisting of a physician and a sociologist specializing in ethics. We observed substantial and clinically significant differences across gendered variants, including differential treatment urgency, inappropriate symptom associations, and misidentification of clinical context. Female variants disproportionately framed responses around childcare and reproductive health regardless of clinical relevance, reflecting societal stereotypes rather than medical reasoning. Bias manifested both at the LLM generation stage and the RAG retrieval stage, in several cases causing the model to hallucinate responses entirely. Some bias patterns were persistent across repeated runs, while others appeared inconsistently, highlighting the challenge of distinguishing systematic bias from stochastic variation.

13
Attitudes and Perceptions of Generative Artificial Intelligence Chatbots in the Scientific Process of Traditional, Complementary, and Integrative Medicine Research: A Large-Scale, International Cross-Sectional Survey

Ng, J. Y.; Tan, J.; Syed, N.; Adapa, K.; Gupta, P. K.; Li, S.; Mehta, D.; Ring, M.; Shridhar, M.; Souza, J. P.; Yoshino, T.; Lee, M. S.; Cramer, H.

2026-04-15 health informatics 10.64898/2026.04.13.26350612 medRxiv
Top 5%
0.1%

Background: Generative artificial intelligence (GenAI) chatbots have shown utility in assisting with various research tasks. Traditional, complementary, and integrative medicine (TCIM) is a patient-centric approach that emphasizes holistic well-being. The integration of TCIM and GenAI presents numerous key opportunities. However, TCIM researchers' attitudes toward GenAI tools remain less understood. This large-scale, international cross-sectional survey aimed to elucidate the attitudes and perceptions of TCIM researchers regarding the use of GenAI chatbots in the scientific process. Methods: A search strategy in Ovid MEDLINE identified corresponding authors who were TCIM researchers. Eligible authors were invited to complete an anonymous online survey administered via SurveyMonkey. The survey included questions on socio-demographic characteristics, familiarity with GenAI chatbots, and perceived benefits and challenges of using GenAI chatbots. Results were analysed using descriptive statistics and thematic content analysis. Results: The survey received 716 responses. Most respondents reported familiarity with GenAI chatbots (58.08%) and viewed them as very important to the future of scientific research (54.37%). The most acknowledged benefits included workload reduction (74.07%) and increased efficiency in data analysis/experimentation (71.14%). The most frequently reported challenges involved bias, errors, and limitations. More than half of the respondents (57.02%) expressed a need for training to use GenAI chatbots in the scientific process, alongside an interest in receiving training (72.07%). However, 43.67% indicated that their institutions did not offer these programs. Discussion: By developing a deeper understanding of TCIM researchers' perspectives, future AI applications in this field can be more informed, and guide future policies and collaboration among researchers.

14
Apnea-hypopnea index estimation with wrist-worn photoplethysmography

Fonseca, P.; Ross, M.; van Meulen, F.; Asin, J.; van Gilst, M. M.; Overeem, S.

2026-04-11 health informatics 10.64898/2026.04.08.26350411 medRxiv
Top 5%
0.1%

Objective: Long-term monitoring of obstructive sleep apnea (OSA) severity may be relevant for several clinical applications. We developed a method for estimating the apnea-hypopnea index (AHI) using wrist-worn, reflective photoplethysmography (PPG). Approach: A neural network was developed to detect respiratory events using PPG and PPG-derived sleep stages as input. The development database encompassed retrospective data from three polysomnographic datasets (N=3111), including a dataset with concurrent reflective PPG recordings from a wrist-worn device (N=969). The model was pre-trained with (transmissive) finger-PPG signals from all overnight recordings and then fine-tuned to wrist-PPG characteristics using transfer learning. Validation was performed on the test portion of the development set and on a fourth, external hold-out dataset containing both wrist-PPG and PSG data (N=171). Performance was evaluated in terms of AHI estimation accuracy and OSA severity classification. Main Results: The fine-tuned wrist-PPG model demonstrated strong agreement with the PSG-derived gold-standard AHI, achieving intra-class correlation coefficients of 0.87 in the test portion of the development set and 0.91 in the external hold-out validation set. Diagnostic performance was high, with accuracies above 80% for all severity thresholds. Significance: The study highlights the potential of reflective PPG-based AHI estimation, achieving high estimation performance in comparison with PSG. These measurements can be performed with relatively comfortable sensors integrated in convenient wrist-worn wearables, enabling long-term assessment of sleep disordered breathing, both in a diagnostic phase, and during therapy follow-up.

15
Cochrane Evaluation of (Semi-) Automated Review (CESAR) Methods: Protocol for an adaptive platform study within reviews

Gartlehner, G.; Banda, S.; Callaghan, M.; Chase, J.-A.; Dobrescu, A.; Eisele-Metzger, A.; Flemyng, E.; Gardner, S.; Griebler, U.; Helfer, B.; Jemiolo, P.; Macura, B.; Minx, J. C.; Noel-Storr, A.; Rajabzadeh Tahmasebi, N.; Sharifan, A.; Meerpohl, J.; Thomas, J.

2026-04-15 health informatics 10.64898/2026.04.13.26350802 medRxiv
Top 5%
0.1%

Background: Artificial intelligence (AI) has the potential to improve the efficiency of evidence synthesis and reduce human error. However, robust methods for evaluating rapidly evolving AI tools within the practical workflows of evidence synthesis remain underdeveloped. This protocol describes a study design for assessing the effectiveness, efficiency, and usability of AI tools in comparison to traditional human-only workflows in the context of Cochrane systematic reviews. Methods: Members of the Cochrane Evaluation of (Semi-) Automated Review (CESAR) Methods Project developed an adaptive platform study-within-a-review (SWAR) design, modeled after clinical platform trials. This design employs a master protocol to concurrently evaluate multiple AI tools (interventions) against a standard human-only process (control) across three key review tasks: title and abstract screening, full-text screening, and data extraction. The adaptive framework allows for the addition or removal of AI tools based on interim performance analyses without necessitating a restart of the study. Performance will be assessed using metrics such as accuracy (sensitivity, specificity, precision), efficiency (time on task), response stability, impact of errors, and usability, in alignment with Responsible use of AI in evidence SynthEsis (RAISE) principles. Results: The study will generate comparative data about the performance and usability of specific AI tools employed in a semi- or fully automated manner relative to standard human effort. The protocol provides a flexible framework for the assessment of AI tools in evidence synthesis, addressing the limitations of static, one-time evaluations. Discussion: This study protocol presents a novel methodological approach to addressing the challenges of evaluating AI tools for evidence syntheses. 
By validating entire workflows rather than individual technologies, the findings will establish an evidence base for determining the viability of integrating AI into evidence-synthesis workflows. The adaptive design of this study is flexible and can be adopted by other investigators, ensuring that the evaluation framework remains relevant as new tools emerge.
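The accuracy metrics named in this protocol (sensitivity, specificity, precision) can be made concrete with a small sketch for a screening task. It assumes include/exclude decisions are encoded as booleans and that a reference standard is available per record; the function name `screening_metrics` and that encoding are illustrative assumptions, not part of the CESAR protocol.

```python
def screening_metrics(tool_decisions, reference_decisions):
    """Compare a tool's include/exclude decisions with a reference standard.

    Both arguments are equal-length sequences of booleans (True = include).
    Hypothetical helper for illustration; returns None for any metric whose
    denominator is zero.
    """
    if len(tool_decisions) != len(reference_decisions):
        raise ValueError("decision lists must have the same length")

    tp = fp = tn = fn = 0
    for tool, ref in zip(tool_decisions, reference_decisions):
        if tool and ref:
            tp += 1  # correctly included
        elif tool and not ref:
            fp += 1  # wrongly included
        elif not tool and not ref:
            tn += 1  # correctly excluded
        else:
            fn += 1  # wrongly excluded (a missed relevant record)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    tool = [True, True, False, False, True]
    reference = [True, False, False, True, True]
    print(screening_metrics(tool, reference))
```

In screening, a false negative drops a relevant record from the review, which is why sensitivity is reported separately from precision.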

16
Estimating the strength of symptom propagation from primary-secondary case pair data

Asplin, P.; Mancy, R.; Keeling, M. J.; Hill, E. M.

2026-04-13 infectious diseases 10.64898/2026.04.07.26350037 medRxiv
Top 5%
0.1%

Symptom propagation occurs when the symptoms of secondary cases are related to those of the primary case as a result of epidemiological mechanisms. Determining whether, and to what extent, symptom propagation occurs requires data-driven methods. Here we quantify the strength of symptom propagation as the increase in risk of a secondary case developing severe symptoms if the primary case has severe symptoms. We first used synthetic results to determine the data requirements to robustly estimate the strength of symptom propagation and to investigate the effect of severity-dependent reporting bias. Categorising symptom severity into two groups (mild or severe; asymptomatic or symptomatic), our estimation requires only four summary statistics: the number of primary-secondary case pairs of each combination of symptom presentations. Our analysis showed that a relatively small number (100) of synthetic primary-secondary case pairs was sufficient to obtain a reasonable estimate of the strength of symptom propagation and 1,000 pairs meant errors were consistently small across replicates. Our estimates were robust to severity-dependent reporting bias. We also explored how symptom propagation can be separated from other individual-level factors affecting severity, using age dependence as an example. Although synthetic data generated from an age-structured model led to overestimations of the strength of symptom propagation, allowing disease severity to be age-dependent restored the accuracy of parameter estimation. Finally, we applied our methodology to estimate the strength of symptom propagation from three publicly available data sets collected during the COVID-19 pandemic with data on presence or absence of symptoms: England households, Israel households, and Norway contact tracing. Our age-free methodology indicated a 12-17% increase in the risk of being symptomatic if infected by someone symptomatic. 
Our positive estimates for the strength of symptom propagation persisted when applying our age-dependent methodology to the two household data sets with age-structured information (England and Israel). These findings demonstrate evidence for symptom propagation of SARS-CoV-2 and provide consistent estimates for its strength. Our synthetic data analysis supports the conclusion that these correlations are not a result of reporting bias or age-dependent effects. This work provides a practical tool for estimating the strength of symptom propagation that has minimal data requirements, enabling application across a wide range of pathogens and epidemiological settings.
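To make the "four summary statistics" idea concrete, the following is a minimal sketch of a naive risk-difference calculation from the four pair counts. It is not the authors' estimator, which the abstract does not specify; the function name `risk_increase`, the argument names, the choice of an absolute (rather than relative) risk increase, and the example counts are all assumptions made for illustration.

```python
def risk_increase(n_ss, n_sm, n_ms, n_mm):
    """Naive absolute risk increase from four primary-secondary pair counts.

    Hypothetical argument names (not from the abstract):
      n_ss: primary severe, secondary severe
      n_sm: primary severe, secondary mild
      n_ms: primary mild,   secondary severe
      n_mm: primary mild,   secondary mild
    """
    severe_primaries = n_ss + n_sm
    mild_primaries = n_ms + n_mm
    if severe_primaries == 0 or mild_primaries == 0:
        raise ValueError("need at least one pair for each primary severity")
    # Risk of a severe secondary case, conditional on the primary's severity.
    risk_given_severe = n_ss / severe_primaries
    risk_given_mild = n_ms / mild_primaries
    return risk_given_severe - risk_given_mild


# Made-up counts for illustration only (not data from the study).
estimate = risk_increase(n_ss=60, n_sm=40, n_ms=45, n_mm=55)
print(f"{estimate:.2f}")  # prints 0.15
```

A raw difference like this does not adjust for reporting bias or age, which the abstract says the authors examine separately.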

17
A Retrospective Propensity Score Matched Cohort Study Comparing Intact Fish Skin Graft with Synthetic and Biosynthetic Dermal Substitutes for Acute Burn Injuries Requiring Dermal Substitution and Autografting: Outcomes from the American Burn Association Registry

Sood, R.; Hevelone, N. D.; Davidsson, O. B.; Kristjansson, R. P.; Phillips, B. D.; Lantis, J. C.; Johannsson, G.

2026-04-16 intensive care and critical care medicine 10.64898/2026.04.14.26350896 medRxiv
Top 6%
0.0%

Objective: The objective of this study was to compare hospital length of stay and other clinical outcomes between intact fish skin graft (IFSG; Graftguide, Kerecis, Arlington, VA) and synthetic/biosynthetic dermal substitutes (SSS; Integra Dermal Regeneration Template and NovoSorb Biodegradable Temporizing Matrix) in propensity score matched burn patients using the American Burn Association Burn Care Quality Platform. Methods: This retrospective cohort study identified adult patients treated with a single dermal substitute product during hospitalization for acute burn injury. Patients receiving IFSG (n = 93) were matched 1:4 to patients receiving SSS (n = 372) using nearest neighbor propensity score matching on the logit scale. Matching covariates included total body surface area burned (TBSA), patient age, sex, burn severity classification, inhalation injury, and trauma diagnosis. The primary outcome was hospital length of stay (LOS), analyzed using a gamma generalized linear mixed model (GLMM). Secondary outcomes included the incidences of sepsis, graft loss, venous thromboembolism (VTE), and hospital acquired pressure injury (HAPI). A prespecified sensitivity analysis was performed using a broader mixed product cohort. Results: A total of 93 IFSG treated patients from 17 burn centers admitted between the years 2019 and 2025 were matched 1:4 to 372 SSS treated patients from 44 centers. Unadjusted mean LOS was 24.1 days (median 20, IQR 11 to 32) in the IFSG treated group and 36.7 days (median 31, IQR 17 to 52) in the SSS treated group, representing a 12.6 day reduction. GLMM-adjusted estimated marginal mean LOS was 24.2 days (95% CI, 20.0 to 29.4) for IFSG versus 33.5 days (95% CI, 30.0 to 37.6) for SSS (ratio 0.723; p = 0.00245), representing a 9.3 day reduction. 
Sepsis (1.1% vs 4.6%), graft loss (3.2% vs 8.3%), VTE (2.2% vs 2.7%), and HAPI (2.2% vs 3.8%) were all numerically lower in the IFSG treated arm, although GLMM-adjusted odds ratios were not statistically significant for any individual complication. The mixed cohort sensitivity analysis (n = 229 IFSG vs 458 SSS across 67 centers) confirmed the primary finding with GLMM adjusted LOS ratio 0.716 (p = 0.0001). Conclusions: In this propensity score matched analysis of the ABA registry, IFSG was associated with a statistically significant and clinically meaningful reduction in hospital length of stay compared with synthetic/biosynthetic dermal substitutes in burns requiring dermal substitution and autografting, with all complication rates (sepsis, graft loss, VTE, and HAPI) numerically lower in the IFSG-treated arm. The shorter hospitalization was not achieved at the expense of safety. These findings support IFSG as a viable alternative to synthetic dermal substitutes in burns requiring dermal substitution and autografting. Prospective studies are warranted, particularly in larger burns requiring staged reconstruction.

18
Leveraging State-of-the-Art LLMs for the De-identification of Sensitive Health Information in Clinical Speech

Dai, H.-J.; Mir, T. H.; Fang, L.-C.; Chen, C.-T.; Feng, H.-H.; Lai, J.-R.; Hsu, H.-C.; Nandy, P.; Panchal, O.; Liao, W.-H.; Tien, Y.-Z.; Chen, P.-Z.; Lin, Y.-R.; Jonnagaddala, J.

2026-04-17 health informatics 10.64898/2026.04.13.26349911 medRxiv
Top 6%
0.0%

Accurate recognition and deidentification of sensitive health information (SHI) in spoken dialogues requires multimodal algorithms that can understand medical language and contextual nuance. However, the recognition and deidentification risks expose SHI. Additionally, the variability and complexity of medical terminology, along with the inherent biases in medical datasets, further complicate this task. This study introduces the SREDH/AI-Cup 2025 Medical Speech Sensitive Information Recognition Challenge, which focuses on two tasks: Task 1, speech transcription, in which systems must accurately transcribe speech into text; and Task 2, medical speech de-identification, in which systems must detect and appropriately classify mentions of SHI. The competition attracted 246 teams; top-performing systems achieved a mixed error rate (MER) of 0.1147 and a macro F1-score of 0.7103, with average MER and macro F1-score of 0.3539 and 0.2696, respectively. Results were presented at the IW-DMRN workshop in 2025. Notably, the results reveal that LLMs were prevalent across both tasks: 97.5% of teams adopted LLMs for Task 1 and 100% for Task 2, highlighting their growing role in healthcare. Furthermore, we finetuned six models, demonstrating strong precision (~0.885-0.889) with slightly lower recall (~0.830-0.847), resulting in F1-scores of 0.857-0.867.
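As a quick consistency check on the reported figures, the sketch below computes F1 as the harmonic mean of precision and recall. Pairing the lowest precision with the lowest recall (and the highest with the highest) is an assumption; the abstract gives only ranges, not per-model values. The function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Range endpoints taken from the abstract; the pairing is assumed.
low = f1_score(0.885, 0.830)
high = f1_score(0.889, 0.847)
print(f"{low:.3f} {high:.3f}")  # prints 0.857 0.867
```

Under that pairing the endpoints reproduce the reported F1 range of 0.857-0.867. The challenge's macro F1-score is a different quantity: it averages per-class F1 values, so it cannot be recovered from a single precision and recall pair.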

19
Democratizing Scientific Publishing: A Local, Multi-Agent LLM Framework for Objective Manuscript Editing

Bhansali, R.; Gorenshtein, A.; Westover, B.; Goldenholz, D. M.

2026-04-17 health informatics 10.64898/2026.04.13.26350761 medRxiv
Top 6%
0.0%

Manuscript preparation is a critical bottleneck in scientific publishing, yet existing AI writing tools require cloud transmission of sensitive content, creating data-confidentiality barriers for clinical researchers. We introduce the Paper Analysis Tool (PAT), a free, multi-agent framework that deploys 31 specialized agents powered by small language models (SLMs) to audit manuscripts across multiple quality dimensions without external data transmission. Applied to three published clinical neurological papers, PAT generated 540 evaluable suggestions. Validation by two expert reviewers (R.B., A.G.) confirmed 391 actionable, high-value revisions (90% agreement), achieving a 72.4% overall usefulness accuracy spanning methodological, statistical, and visual domains. Furthermore, deterministic re-evaluation of 126 agent-suggested rewrite pairs using Phase 0 metrics confirmed text improvement: total word count decreased by 25%, passive voice prevalence dropped sharply from 35% to 5%, average sentence length decreased by 24%, long-sentence fraction fell by 67%, and the Flesch-Kincaid grade improved by 17%. Our validation confirms that systematic, agent-driven pre-submission review drives measurable improvements, successfully converting manuscript optimization from an opaque, manual endeavor into a transparent and rigorous scientific process.

20
Spatial Decomposition of Longitudinal RNFL Maps Reveals Distinct Modes of Glaucomatous Progression with Structure Function and Genetic Signatures

Chen, L.; Zhao, Y.; Moradi, M.; Eslami, M.; Wang, M.; Elze, T.; Zebardast, N.

2026-04-11 health informatics 10.64898/2026.04.09.26350387 medRxiv
Top 6%
0.0%

Purpose: To determine whether spatial decomposition of longitudinal retinal nerve fiber layer (RNFL) change maps reveals distinct modes of glaucomatous progression masked by conventional averaging, and to validate these modes through structure-function mapping and genetic association analysis. Methods: Pixel-wise RNFL rates of change were computed from longitudinal optic disc OCT scans of 15,242 eyes (8,419 adults with primary open-angle glaucoma [POAG]; Massachusetts Eye and Ear, 1998 to 2023). A loss-only constraint zeroed all thickening values, reflecting the biological prior that adult RNFL does not regenerate. Nonnegative matrix factorization decomposed these maps into spatial progression components (80% training set). Components were evaluated in a held-out set (20%) for retinotopic structure-function concordance, visual field (VF) progressor classification against global and quadrant RNFL rates, and enrichment of genetic association signals at established POAG loci. Results: Six anatomically distinct progression patterns emerged, including diffuse circumferential loss, focal peripapillary defects, and arcuate bundle degeneration. Pattern-based models significantly outperformed global RNFL rate for classifying VF progressors (area under the curve, 0.750 [95% CI, 0.709 to 0.790] vs. 0.702; P = .0096) and explained additional variance in functional decline (Nagelkerke pseudo-R², 0.301 vs. 0.198; P = .0011). Structure-function mapping confirmed retinotopic coherence. Spatial phenotypes recovered stronger genetic signals than global rates at 85.3% of established POAG loci, suggesting they capture more biologically homogeneous endophenotypes of progression. Conclusions: Glaucomatous structural progression occurs through spatially distinct modes with independent structure-function and genetic signatures that conventional RNFL averaging obscures.
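The two preprocessing and decomposition steps named in the Methods (a loss-only constraint followed by nonnegative matrix factorization) can be sketched on a toy example. This is an illustrative sketch only: the abstract does not say which NMF algorithm, loss function, or initialization the authors used, so the Lee-Seung multiplicative updates, the function names, and the tiny made-up rate matrix below are all assumptions.

```python
import random


def loss_only(rates):
    """Zero out thickening (positive rates) and return loss magnitudes (>= 0)."""
    return [[max(0.0, -r) for r in row] for row in rates]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, k, iters=500, seed=0, eps=1e-9):
    """Factor nonnegative v (eyes x pixels) as w @ h with k components.

    Uses Lee-Seung multiplicative updates for the squared error, an
    assumed choice that keeps every entry of w and h nonnegative.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx)
               for a, b in zip(ra, rb)) ** 0.5


# Made-up rates of change (4 eyes x 4 pixels); negative means thinning.
rates = [
    [-2.0, -1.0, 0.5, 0.0],
    [-4.0, -2.0, 0.0, 0.3],
    [0.2, 0.0, -3.0, -1.5],
    [0.0, 0.1, -1.0, -0.5],
]
loss = loss_only(rates)
w, h = nmf(loss, k=2)  # rows of h are spatial patterns, w holds per-eye loadings
error = frobenius_error(loss, w, h)
```

In this sketch each row of `h` plays the role of a spatial progression component and each row of `w` gives one eye's loading on those components. A real analysis would use an optimized library implementation and full-resolution maps rather than this plain-Python toy.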